106 research outputs found

    The Use of Twitter to Track Levels of Disease Activity and Public Concern in the U.S. during the Influenza A H1N1 Pandemic

    Twitter is a free social networking and micro-blogging service that enables its millions of users to send and read each other's “tweets,” or short, 140-character messages. The service has more than 190 million registered users and processes about 55 million tweets per day. Useful information about news and geopolitical events lies embedded in the Twitter stream, which embodies, in the aggregate, Twitter users' perspectives and reactions to current events. By virtue of sheer volume, content embedded in the Twitter stream may be useful for tracking or even forecasting behavior if it can be extracted in an efficient manner. In this study, we examine the use of information embedded in the Twitter stream to (1) track rapidly-evolving public sentiment with respect to H1N1 or swine flu, and (2) track and measure actual disease activity. We also show that Twitter can be used as a measure of public interest or concern about health-related events. Our results show that estimates of influenza-like illness derived from Twitter chatter accurately track reported disease levels

    GET WELL: an automated surveillance system for gaining new epidemiological knowledge

    Background: The assumption behind the presented work is that the information people search for on the internet reflects the disease status in society. By having access to this source of information, epidemiologists can get a valuable complement to the traditional surveillance and potentially get new and timely epidemiological insights. For this purpose, the Swedish Institute for Infectious Disease Control collaborates with a medical web site in Sweden. Methods: We built an application consisting of two conceptual parts. One part allows for trends, based on user specified requests, to be extracted from anonymous web query data from a Swedish medical web site. The second conceptual part permits tailored analyses of particular diseases, where more complex statistical methods are applied to the data. To evaluate the epidemiological relevance of the output, we compared Google search data and search data from the medical web site. Results: In the paper, we give concrete examples of the output from the web query-based system. We also present results from the comparison between data from the search engine Google and search data from the national medical web site. Conclusions: The application is in regular use at the Swedish Institute for Infectious Disease Control. A system based on web queries is flexible in that it can be adapted to any disease; we get information on other individuals than those who seek medical care; and the data do not suffer from reporting delays. Although Google data are based on a substantially larger search volume, search patterns obtained from the medical web site may still convey more information from an epidemiological perspective. Furthermore we can see advantages with having full access to the raw data.

    Measuring the impact of health policies using Internet search patterns: the case of abortion

    Background: Internet search patterns have emerged as a novel data source for monitoring infectious disease trends. We propose that these data can also be used more broadly to study the impact of health policies across different regions in a more efficient and timely manner. Methods: As a test use case, we studied the relationships between abortion-related search volume, local abortion rates, and local abortion policies available for study. Results: Our initial integrative analysis found that, both in the US and internationally, the volume of Internet searches for abortion is inversely proportional to local abortion rates and directly proportional to local restrictions on abortion. Conclusion: These findings are consistent with published evidence that local restrictions on abortion lead individuals to seek abortion services outside of their area. Further validation of these methods has the potential to produce a timely, complementary data source for studying the effects of health policies.

    Prediction of Dengue Incidence Using Search Query Surveillance

    Improvements in surveillance, prediction of outbreaks and the monitoring of the epidemiology of dengue virus in countries with underdeveloped surveillance systems are of great importance to ministries of health and other public health decision makers who are often constrained by budget or man-power. Google Flu Trends has proven successful in providing an early warning system for outbreaks of influenza weeks before case data are reported. We believe that there is greater potential for this technique for dengue, as the incidence of this pathogen can vary by a factor of ten in some settings, making prediction all the more important in public health planning. In this paper, we demonstrate the utility of Google search terms in predicting dengue incidence in Singapore and Bangkok, Thailand using several regression techniques. Incidence data were provided by the Singapore Ministry of Health and the Thailand Bureau of Epidemiology. We find our models predict incident cases well (correlation greater than 0.8) and periods of high incidence equally well (AUC greater than 0.95). All data and analysis code used in our study are available free online and can be adapted to other settings

    A multi-level geographical study of Italian political elections from Twitter Data

    In this paper we present an analysis of the behavior of Italian Twitter users during national political elections. We monitor the volumes of the tweets related to the leaders of the various political parties and we compare them to the election results. Furthermore, we study the topics that are associated with the co-occurrence of two politicians in the same tweet. We cannot conclude, from a simple statistical analysis of tweet volume and its evolution over time, that it is possible to precisely predict the election outcome (or at least not in our case study, which was characterized by a “too-close-to-call” scenario). On the other hand, we found that the volume of tweets and their change in time provide a very good proxy of the final results. We present this analysis both at a national level and at smaller levels, ranging from the regions composing the country to macro-areas (North, Center, South).

    Simulation of an SEIR infectious disease model on the dynamic contact network of conference attendees

    The spread of infectious diseases crucially depends on the pattern of contacts among individuals. Knowledge of these patterns is thus essential to inform models and computational efforts. Few empirical studies are however available that provide estimates of the number and duration of contacts among social groups. Moreover, their space and time resolution are limited, so that data is not explicit at the person-to-person level, and the dynamical aspect of the contacts is disregarded. Here, we want to assess the role of data-driven dynamic contact patterns among individuals, and in particular of their temporal aspects, in shaping the spread of a simulated epidemic in the population. We consider high resolution data of face-to-face interactions between the attendees of a conference, obtained from the deployment of an infrastructure based on Radio Frequency Identification (RFID) devices that assess mutual face-to-face proximity. The spread of epidemics along these interactions is simulated through an SEIR model, using both the dynamical network of contacts defined by the collected data, and two aggregated versions of such network, in order to assess the role of the data temporal aspects. We show that, on the timescales considered, an aggregated network taking into account the daily duration of contacts is a good approximation to the full resolution network, whereas a homogeneous representation which retains only the topology of the contact network fails in reproducing the size of the epidemic. These results have important implications in understanding the level of detail needed to correctly inform computational models for the study and management of real epidemics

    Web Queries as a Source for Syndromic Surveillance

    In the field of syndromic surveillance, various sources are exploited for outbreak detection, monitoring and prediction. This paper describes a study on queries submitted to a medical web site, with influenza as a case study. The hypothesis of the work was that queries on influenza and influenza-like illness would provide a basis for the estimation of the timing of the peak and the intensity of the yearly influenza outbreaks that would be as good as the existing laboratory and sentinel surveillance. We calculated the occurrence of various queries related to influenza from search logs submitted to a Swedish medical web site for two influenza seasons. These figures were subsequently used to generate two models, one to estimate the number of laboratory verified influenza cases and one to estimate the proportion of patients with influenza-like illness reported by selected General Practitioners in Sweden. We applied an approach designed for highly correlated data, partial least squares regression. In our work, we found that certain web queries on influenza follow the same pattern as that obtained by the two other surveillance systems for influenza epidemics, and that they have equal power for the estimation of the influenza burden in society. Web queries give a unique access to ill individuals who are not (yet) seeking care. This paper shows the potential of web queries as an accurate, cheap and labour extensive source for syndromic surveillance

    Using Web Search Query Data to Monitor Dengue Epidemics: A New Model for Neglected Tropical Disease Surveillance

    A variety of obstacles, including bureaucracy and lack of resources, delay the detection and reporting of dengue in many countries where the disease is a major public health threat. Surveillance efforts have turned to modern data sources such as Internet usage data. People often seek health-related information online and it has been found that the frequency of, for example, influenza-related web searches as a whole rises as the number of people sick with influenza rises. Tools have been developed to help track influenza epidemics by finding patterns in certain web search activity. However, few have evaluated whether this approach would also be effective for other diseases, especially those that affect many people, that have severe consequences, or for which there is no vaccine. In this study, we found that aggregated, anonymized Google search query data were also capable of tracking dengue activity in Bolivia, Brazil, India, Indonesia and Singapore. Whereas traditional dengue data from official sources are often not available until after a long delay, web search query data are available for analysis within a day. Therefore, because they could potentially provide earlier warnings, these data represent a valuable complement to traditional dengue surveillance.

    Assessing the impact of a health intervention via user-generated Internet content

    Assessing the effect of a health-oriented intervention by traditional epidemiological methods is commonly based only on population segments that use healthcare services. Here we introduce a complementary framework for evaluating the impact of a targeted intervention, such as a vaccination campaign against an infectious disease, through a statistical analysis of user-generated content submitted on web platforms. Using supervised learning, we derive a nonlinear regression model for estimating the prevalence of a health event in a population from Internet data. This model is applied to identify control location groups that correlate historically with the areas where a specific intervention campaign has taken place. We then determine the impact of the intervention by inferring a projection of the disease rates that could have emerged in the absence of a campaign. Our case study focuses on the influenza vaccination program that was launched in England during the 2013/14 season, and our observations consist of millions of geo-located search queries to the Bing search engine and posts on Twitter. The impact estimates derived from the application of the proposed statistical framework support conventional assessments of the campaign.